This notebook provides a template for you to implement, in stages, the functionality required to complete this project. If additional code is needed that cannot be included in the notebook, make sure the Python code is successfully imported and included with your submission. Sections whose header begins with 'Implementation' indicate where your implementation should go. Note that some implementation sections are optional and are marked with 'Optional' in the header.
In addition to implementing code, you must answer questions related to the project and your implementation. Each section where you will answer a question is preceded by a 'Question' header. Read each question carefully and provide a thorough answer in the text box that begins with 'Answer:'. Your submission will be evaluated on both your answers and the implementation you provide.
Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. In addition, Markdown cells can be edited, typically by double-clicking the cell to enter edit mode.
Visualize the German Traffic Signs Dataset. This is open-ended; some suggestions include plotting traffic sign images, plotting the count of each sign, etc. Be creative!
The pickled data is a dictionary with 4 key/value pairs:
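The key list isn't reproduced here; the two keys used below are 'features' and 'labels'. As a minimal sketch of that layout (the array shapes are illustrative, not the real dataset's), a dictionary of NumPy arrays survives a pickle round-trip unchanged:

```python
import io
import pickle

import numpy as np

# Toy stand-in for the pickled dataset; shapes are illustrative only
blob = {'features': np.zeros((2, 32, 32, 3), dtype=np.uint8),
        'labels': np.array([0, 1])}

# Round-trip through an in-memory buffer, as with the real .p files
buf = io.BytesIO()
pickle.dump(blob, buf, pickle.HIGHEST_PROTOCOL)
buf.seek(0)
restored = pickle.load(buf)
print(sorted(restored.keys()))  # ['features', 'labels']
```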
# Import all the utility functions
import pickle
import random
from random import randint
import csv
import os
import time
# Import the data processing libs
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec
%matplotlib inline
# Import OpenCV
import cv2
# Import the Deep Learning libs
import tensorflow as tf
from sklearn.preprocessing import LabelBinarizer
from skimage import exposure
from sklearn.model_selection import train_test_split
# Location of the pickle file
training_file = "traffic-signs-data/train.p"
testing_file = "traffic-signs-data/test.p"
# Load the data
with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
# Split features and labels
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']
### To start off let's do a basic data summary.
# Number of training examples
n_train = X_train.shape[0]
# Number of testing examples
n_test = X_test.shape[0]
# The shape of an image (height, width); the channel depth is dropped here
image_shape = (X_train.shape[1], X_train.shape[2])
# Number of classes in the dataset
n_classes = len(set(y_train))
print("Number of training examples = {}".format(n_train))
print("Number of testing examples = {}".format(n_test))
print("Image data shape = {}".format(image_shape))
print("Number of classes = {}".format(n_classes))
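The same summary computations can be sanity-checked on a toy array (the shapes below are illustrative, not the real dataset's):

```python
import numpy as np

# Toy stand-in: 10 RGB images of 32x32 across 3 classes
X_toy = np.zeros((10, 32, 32, 3), dtype=np.uint8)
y_toy = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])

n = X_toy.shape[0]          # number of examples
shape = X_toy.shape[1:3]    # (height, width); depth dropped as above
classes = len(set(y_toy))   # number of distinct classes
print(n, shape, classes)    # 10 (32, 32) 3
```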
### Data exploration visualization goes here.
### Feel free to use as many code cells as needed.
# Save the sign names in a dictionary
SignNames = {}
# Read the CSV file
with open('signnames.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in reader:
        if row[0] != 'ClassId':  # skip the header row
            SignNames[int(row[0])] = row[1]
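An alternative is csv.DictReader, which consumes the header row for you. This sketch runs on an in-memory stand-in for signnames.csv; the 'SignName' column name and the sample rows are illustrative assumptions:

```python
import csv
import io

# In-memory stand-in for signnames.csv; column name 'SignName' and the
# sample rows are assumptions, not taken from the real file
sample = ("ClassId,SignName\n"
          "0,Speed limit (20km/h)\n"
          "1,Speed limit (30km/h)\n")

names = {}
for row in csv.DictReader(io.StringIO(sample)):
    # DictReader already consumed the header, so no 'ClassId' check is needed
    names[int(row['ClassId'])] = row['SignName']
print(names[1])  # Speed limit (30km/h)
```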
""" DISPLAY SOME OF THE IMAGES """
def displaySomeImages(_x, _y):
    # Get the list of classes
    values, counts = np.unique(_y, return_counts=True)
    # Loop through the classes and show one image per class
    i = 0
    fig = plt.figure(figsize=(18, 24))
    gs = gridspec.GridSpec(8, 6, wspace=1., hspace=1.)
    for class_ in values:
        # Get the first image belonging to this class
        idx = next(index for index, val in enumerate(_y) if val == class_)
        img = _x[idx]
        ax = fig.add_subplot(gs[i])
        ax.imshow(img, cmap='gray')
        ax.set_title(SignNames[_y[idx]])
        i += 1
displaySomeImages(X_train, y_train)
# Plot image type distribution
def plot_examples_per_class(labels_):
    values, counts = np.unique(labels_, return_counts=True)
    plt.figure(figsize=(8, 3))
    plt.bar(values, counts)
    plt.ylabel('Num of images')
    plt.xlabel('Class')
    plt.title('Images per class')
plot_examples_per_class(y_train)
plot_examples_per_class(y_test)
Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the German Traffic Sign Dataset.
There are various aspects to consider when thinking about this problem:
Here is an example of a published baseline model on this problem. It's not required that you be familiar with the approach used in the paper, but it's good practice to read papers like this one.
Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.
### Preprocess the data here.
### Feel free to use as many code cells as needed.
def grayscale(_x):
    """
    Converts the array of images into grayscale

    Parameters
    ----------
    _x: The numpy array of images. Should be 4D (samples, h, w, d)

    return: Numpy array of grayscale images
    """
    ret = []
    for i in np.arange(_x.shape[0]):
        ret.append(cv2.cvtColor(_x[i, :, :, :], cv2.COLOR_BGR2GRAY))
    return np.array(ret)
def equalize(_x):
    """
    Equalizes the image histogram and also normalizes the image

    Parameters
    ----------
    _x: The numpy array of grayscale images. Should be 3D (samples, h, w)

    return: Processed image data
    """
    ret = []
    for img in _x:
        ret.append(exposure.equalize_adapthist(img))
    return np.array(ret)
def OneHot(_x):
    """
    Turns the labels into a one-hot encoded vector

    Parameters
    ----------
    _x: The numpy array of labels. Should be 1D

    return: The one-hot encoded labels
    """
    encoder = LabelBinarizer()
    one_hot_encoded = encoder.fit_transform(_x)
    return one_hot_encoded
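For reference, LabelBinarizer's behavior on integer labels can be reproduced in a few lines of NumPy. This is a sketch, not a drop-in replacement: LabelBinarizer also handles string labels and collapses the two-class case to a single column, which this version does not.

```python
import numpy as np

def one_hot_np(labels):
    """Pure-NumPy one-hot encoding for integer labels. Columns follow the
    sorted order of the classes present, as LabelBinarizer does."""
    classes = np.unique(labels)
    return (labels[:, None] == classes[None, :]).astype(np.float32)

y = np.array([0, 2, 1, 2])
print(one_hot_np(y))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [0. 0. 1.]]
```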
print('Success: Preprocessing functions loaded.')
# Preprocess Data
X_train_nrm = equalize(grayscale(X_train))
X_test_nrm = equalize(grayscale(X_test))
y_train_ohe = OneHot(y_train)
y_test_ohe = OneHot(y_test)
print('Success: Data preprocessed.')
# Let's visualize some of our processed images
displaySomeImages(X_train_nrm, y_train)
# Let's visualize some of our processed images
displaySomeImages(X_test_nrm, y_test)
Describe the techniques used to preprocess the data.
Answer:
I found skimage's exposure.equalize_adapthist to give the best performance, as evidenced by the images above.
### Generate additional data (if you want to!)
### and split the data into training/validation/testing sets here.
### Feel free to use as many code cells as needed.
# Function to generate transformed images
# from https://nbviewer.jupyter.org/github/vxy10/SCND_notebooks/blob/master/preprocessing_stuff/img_transform_NB.ipynb
def transform_image(img, ang_range, shear_range, trans_range):
    '''
    This function transforms an image to generate new images.

    The function takes the following arguments:
    1- img: the image
    2- ang_range: range of angles for the rotation
    3- shear_range: range of values for the affine (shear) transform
    4- trans_range: range of values for the translation

    A random uniform distribution is used to generate the parameters
    for each transformation.
    '''
    # Rotation
    ang_rot = np.random.uniform(ang_range) - ang_range / 2
    rows, cols = img.shape
    Rot_M = cv2.getRotationMatrix2D((cols / 2, rows / 2), ang_rot, 1)
    # Translation
    tr_x = trans_range * np.random.uniform() - trans_range / 2
    tr_y = trans_range * np.random.uniform() - trans_range / 2
    Trans_M = np.float32([[1, 0, tr_x], [0, 1, tr_y]])
    # Shear
    pts1 = np.float32([[5, 5], [20, 5], [5, 20]])
    pt1 = 5 + shear_range * np.random.uniform() - shear_range / 2
    pt2 = 20 + shear_range * np.random.uniform() - shear_range / 2
    pts2 = np.float32([[pt1, 5], [pt2, pt1], [5, pt2]])
    shear_M = cv2.getAffineTransform(pts1, pts2)
    # Perform the transformations
    img = cv2.warpAffine(img, Rot_M, (cols, rows))
    img = cv2.warpAffine(img, Trans_M, (cols, rows))
    img = cv2.warpAffine(img, shear_M, (cols, rows))
    return img
print('Success: transform_image function loaded.')
# Example transformation
image = random.choice(X_train_nrm)
image = transform_image(image,20,10,5)
plt.imshow(image, cmap='gray')
# Count how many examples each class currently has
values, counts = np.unique(y_train, return_counts=True)
IMG_PER_CLASS = 2500 # Target examples per class; set to 0 to skip generating new images
# The transformed data is cached to a pickle file below for reuse
pickle_file = 'transformed_data.pickle'
for class_ in values:
    print('working on class #', class_, '...')
    num_img_needed = IMG_PER_CLASS - counts[class_]
    # Create a generator yielding the indexes belonging to this class
    gen = (index for index, val in enumerate(y_train) if val == class_)
    # Reshape the scalar label into a 1-element array for concatenation
    class_ = class_.reshape([1])
    # Continue as long as more images are needed
    for num in np.arange(num_img_needed):
        try:
            # Fetch an image from the class
            first_example = X_train_nrm[next(gen)]
        except StopIteration:
            # The generator is exhausted; recreate it and try again
            gen = (index for index, val in enumerate(y_train) if val == class_[0])
            first_example = X_train_nrm[next(gen)]
        # Transform the image
        transformed_example = transform_image(first_example, 20, 10, 5)
        # Reshape it and add it back to the training set
        transformed_example = transformed_example.reshape([1, 32, 32])
        X_train_nrm = np.concatenate([X_train_nrm, transformed_example])
        y_train = np.concatenate([y_train, class_])
# Reform the OHE for the training labels
y_train_ohe = OneHot(y_train)
# Plot all the classes to ensure that they are equalized
plot_examples_per_class(y_train)
# Always save the data for easy access
print('Saving data to pickle file...')
try:
    with open(pickle_file, 'wb') as pfile:
        pickle.dump(
            {
                'X_train': X_train_nrm,
                'y_train': y_train_ohe,
                'X_test': X_test_nrm,
                'y_test': y_test_ohe
            },
            pfile, pickle.HIGHEST_PROTOCOL)
except Exception as e:
    print('Unable to save data to', pickle_file, ':', e)
    raise
print('Data cached in pickle file.')
When re-running the model, it's sufficient to execute from this point onward...
# Reload the data
pickle_file = 'transformed_data.pickle'
with open(pickle_file, 'rb') as f:
    pickle_data = pickle.load(f)
    X_train_nrm = pickle_data['X_train']
    y_train_ohe = pickle_data['y_train']
    X_test_nrm = pickle_data['X_test']
    y_test_ohe = pickle_data['y_test']
    del pickle_data  # Free up memory
print('Training and test data loaded')
# Check that the labels are now evenly distributed
plot_examples_per_class(y_train)
def split_train_and_val(_x, _y, split=0.2):
    """
    Split the training data into training and cross-validation sets

    Parameters
    ----------
    _x, _y: The features and labels of the training data
    """
    (train_features, valid_features, train_labels, valid_labels) = \
        train_test_split(_x, _y, test_size=split,
                         random_state=randint(0, np.power(2, 31)), stratify=_y)
    return (train_features, valid_features, train_labels, valid_labels)
print('Success: Data split functions loaded.')
(train_features, valid_features, train_labels, valid_labels) = \
split_train_and_val(X_train_nrm, y_train_ohe)
test_features = X_test_nrm
test_labels = y_test_ohe
print('Success: Data sets generated.')
Describe how you set up the training, validation and testing data for your model. If you generated additional data, why?
Answer:
Train / Validation / Test: I split the training data 80% / 20% into training and cross-validation sets, following common convention.
Additional data: I generated transformed (rotated, translated, sheared) images to top each class up to 2500 examples, although generating them was very time-consuming.
### Define your architecture here.
### Feel free to use as many code cells as needed.
# Convolutional Neural Network
# Based on https://github.com/aymericdamien/TensorFlow-Examples - convolutional_network
# and "Traffic Sign Recognition with Multi-Scale Convolutional Networks" by Sermanet & LeCun
# Parameters
n_hidden_1 = 128 # Hidden layer 1: 128 units
n_hidden_2 = 128 # Hidden layer 2: 128 units
# Network Parameters
n_input = image_shape[0] * image_shape[1] # img shape: 32*32
n_classes = n_classes # total classes
init_stdev = 0.05 # Std. Dev of weights initialization
# Sizing parameters
pool_sizeNstride = 2 #pooling size & stride
input_channels = 1 #grayscale
input_size = image_shape[0] #size of 1 side of input square
wc1_convoSize = 5 #size of 1 side of convo sq
wc1_outDepth = 12 #convo output depth
wc1_outSize = int(input_size / pool_sizeNstride) #size of 1 side of output square
wc2_convoSize = 5
wc2_outDepth = 48
wc2_outSize = int(wc1_outSize / pool_sizeNstride)
wc1_total = wc1_outSize * wc1_outSize * wc1_outDepth
wc2_total = wc2_outSize * wc2_outSize * wc2_outDepth
multiscale_len = wc1_total + wc2_total #multi-scale feature results
# tf Graph input
x = tf.placeholder(tf.float32, [None, image_shape[0], image_shape[1]])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32) #dropout (keep probability)
# Down-sampling
def maxpool2d(x_, k_=2):
    # MaxPool2D wrapper
    return tf.nn.max_pool(x_, ksize=[1, k_, k_, 1], strides=[1, k_, k_, 1],
                          padding='SAME')

# Create some wrappers for simplicity
def conv2d(x_, W_, b_, drop_, strides_=1):
    x = tf.nn.conv2d(x_, W_, strides=[1, strides_, strides_, 1], padding='SAME')
    x = tf.nn.bias_add(x, b_)
    x = tf.nn.relu(x)
    x = tf.nn.dropout(x, drop_)
    x = maxpool2d(x, pool_sizeNstride)
    return x
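To make the downsampling concrete, here is a pure-NumPy sketch of 2x2 max-pooling with stride 2, equivalent in effect to the maxpool2d wrapper on a single channel (assuming the side length divides evenly by k, as it does here):

```python
import numpy as np

def maxpool2d_np(img, k=2):
    """NumPy sketch of k x k max-pooling with stride k on one 2D channel.
    Assumes the side lengths divide evenly by k."""
    h, w = img.shape
    return img.reshape(h // k, k, w // k, k).max(axis=(1, 3))

a = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 1, 2, 3],
              [4, 5, 6, 7]])
print(maxpool2d_np(a))
# [[4 8]
#  [9 7]]
```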
# Create model
def model(x_, weights_, biases_, dropout_):
    # Reshape input picture
    x = tf.reshape(x_, shape=[-1, image_shape[0], image_shape[1], 1])
    # Convolution Layer 1
    conv1 = conv2d(x, weights_['wc1'], biases_['bc1'], dropout_)
    # Convolution Layer 2
    conv2 = conv2d(conv1, weights_['wc2'], biases_['bc2'], dropout_)
    # Multi-scale input to the classifier: flatten both conv outputs
    flat1 = tf.reshape(conv1, [-1, wc1_total])  # row count is the batch size
    flat2 = tf.reshape(conv2, [-1, wc2_total])  # row count is the batch size
    classifier_input = tf.concat(1, [flat1, flat2])  # concatenate along the feature axis
    # Hidden layer with ReLU activation
    h1 = tf.add(tf.matmul(classifier_input, weights_['wh1']), biases_['bh1'])  # Wx+b
    h1 = tf.nn.relu(h1)
    h1 = tf.nn.dropout(h1, dropout_)
    # Hidden layer with ReLU activation
    h2 = tf.add(tf.matmul(h1, weights_['wh2']), biases_['bh2'])  # Wx+b
    h2 = tf.nn.relu(h2)
    h2 = tf.nn.dropout(h2, dropout_)
    # Output layer, the 'classifier'
    out = tf.add(tf.matmul(h2, weights_['out']), biases_['out'])
    return out
# Store layers weight & bias
weights = {
    'wc1': tf.Variable(tf.random_normal([wc1_convoSize, wc1_convoSize, input_channels, wc1_outDepth], stddev=init_stdev)),
    'wc2': tf.Variable(tf.random_normal([wc2_convoSize, wc2_convoSize, wc1_outDepth, wc2_outDepth], stddev=init_stdev)),
    'wh1': tf.Variable(tf.random_normal([multiscale_len, n_hidden_1], stddev=init_stdev)),
    'wh2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2], stddev=init_stdev)),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes], stddev=init_stdev))
}
biases = {
    'bc1': tf.Variable(tf.random_normal([wc1_outDepth], stddev=init_stdev)),
    'bc2': tf.Variable(tf.random_normal([wc2_outDepth], stddev=init_stdev)),
    'bh1': tf.Variable(tf.random_normal([n_hidden_1], stddev=init_stdev)),
    'bh2': tf.Variable(tf.random_normal([n_hidden_2], stddev=init_stdev)),
    'out': tf.Variable(tf.random_normal([n_classes], stddev=init_stdev))
}
# Construct model
pred = model(x, weights, biases, keep_prob)
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
# L2 regularization for the fully connected parameters.
regularizers = tf.nn.l2_loss(weights['wc1']) + tf.nn.l2_loss(biases['bc1']) + \
tf.nn.l2_loss(weights['wc2']) + tf.nn.l2_loss(biases['bc2']) + \
tf.nn.l2_loss(weights['wh1']) + tf.nn.l2_loss(biases['bh1']) + \
tf.nn.l2_loss(weights['wh2']) + tf.nn.l2_loss(biases['bh2']) + \
tf.nn.l2_loss(weights['out']) + tf.nn.l2_loss(biases['out'])
# Add the regularization term to the loss.
beta = 1e-4
cost += beta * regularizers
print('Success: ConvNet defined.')
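As a sanity check, the multi-scale feature sizes defined above can be verified with quick arithmetic (a standalone sketch using the 32x32 input and the conv depths 12 and 48 from the parameters above):

```python
# Quick arithmetic check of the multi-scale feature sizes
input_size = 32
wc1_out = input_size // 2            # 16 after the first 2x2 max-pool
wc2_out = wc1_out // 2               # 8 after the second
wc1_total = wc1_out * wc1_out * 12   # 16*16*12 = 3072
wc2_total = wc2_out * wc2_out * 48   # 8*8*48  = 3072
multiscale_len = wc1_total + wc2_total
print(multiscale_len)                # 6144 inputs feed the first FC layer
```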
What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.) For reference on how to build a deep neural network using TensorFlow, see Deep Neural Network in TensorFlow from the classroom.
Answer:
Model Architecture: Multi-scale CNN
Convolutional part, 2 layers: Conv (5x5x12) -> ReLU -> Dropout -> Pool (2) -> Conv (5x5x48) -> ReLU -> Dropout -> Pool (2)
Fully connected part: takes the multi-scale input (the concatenated outputs of both conv layers); 2 hidden layers of 128 neurons each
Most of the decisions are derived from the paper, with some changes such as increasing 100 to 128 neurons based on empirical results

### Train your model here.
### Feel free to use as many code cells as needed.
def evaluateModel(_x, _y, bat_sz):
    """
    Evaluates the model on either the validation or the test set,
    using batches instead of running it in one shot. This makes it
    easier on memory for bigger network architectures.

    Parameters
    ----------
    _x, _y: The features and the labels
    bat_sz: The batch size

    return: The average (cost, accuracy) as a tuple
    """
    total_batch = int(_x.shape[0] / bat_sz)
    total_cost = 0.
    total_acc = 0.
    # Loop over all batches
    for i in np.arange(total_batch):
        offset = i * bat_sz
        batch_x = _x[offset : offset + bat_sz]
        batch_y = _y[offset : offset + bat_sz]
        # Evaluate the cost and the accuracy
        _cost, _acc = sess.run([cost, accuracy], feed_dict={x: batch_x, y: batch_y, keep_prob: 1.0})
        # Total up the values
        total_cost += _cost
        total_acc += _acc
    # Return the average values
    return (total_cost / total_batch, total_acc * 100. / total_batch)
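The batched averaging in evaluateModel can be illustrated with a standalone NumPy sketch. Note that samples beyond the last full batch are silently dropped, so the batch-averaged result is exact only when the set size is a multiple of the batch size:

```python
import numpy as np

# Mean of per-batch means over equal-sized full batches
vals = np.arange(10, dtype=float)   # pretend per-sample accuracies 0..9
bat = 4
n_batches = len(vals) // bat        # 2 full batches; samples 8 and 9 dropped
batch_means = [vals[i * bat:(i + 1) * bat].mean() for i in range(n_batches)]
avg = sum(batch_means) / n_batches
print(avg)                          # 3.5, the mean of samples 0..7 only
```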
# Tuning variables
EPOCHS = 501
BAT_SZ = 4096
dropout = 0.5 # Dropout, probability to keep units
# Display variables
display_step = int(EPOCHS / 10)
cost_batch = [] # plotting array
batch_avg_loss, batch_avg_acc = 0., 0.
prev_batch_loss = 0.
optimizer = tf.train.AdamOptimizer().minimize(cost)
# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Initializing the variables
init = tf.initialize_all_variables()
# Print the table format
print('{:^20}'.format('Minibatch'),end="")
print('{:^20}'.format('Time'),'{:^20}'.format('Loss'),end="")
print('{:^20}'.format('Train accuracy'),'{:^20}'.format('Validation accuracy'))
# Compute the number of batches required for each epoch
num_batches = int(train_features.shape[0] / BAT_SZ)
# Launch the graph
with tf.Session() as sess:
    # Initialize the session
    sess.run(init)
    # Record the current time
    start_time = time.clock()
    # For each epoch
    for epoch in np.arange(EPOCHS):
        # Loop over the batches
        for i in np.arange(num_batches):
            offset = i * BAT_SZ
            batch_x = train_features[offset : offset + BAT_SZ]
            batch_y = train_labels[offset : offset + BAT_SZ]
            # Run optimization op (backprop) and cost op (to get the loss value)
            _, _c, _acc = sess.run([optimizer, cost, accuracy],
                                   feed_dict={x: batch_x, y: batch_y, keep_prob: dropout})
            # Accumulate totals so we can average later
            batch_avg_loss += _c
            batch_avg_acc += _acc
        # Display logs every display_step epochs
        if epoch % display_step == 0:
            # Calculate the averages over the training batches
            batch_avg_loss /= (display_step * num_batches)
            batch_avg_acc /= (display_step * num_batches)
            # Evaluate the validation set
            _vc, _vacc = evaluateModel(valid_features, valid_labels, BAT_SZ)
            # Save the validation accuracy for plotting
            cost_batch.append(_vacc)
            # Print the values
            print('{:^20}'.format(str(epoch)), end="")
            print('{:^20}'.format(str(time.clock() - start_time)), end="")
            print('{:^20}'.format(str(batch_avg_loss)), end="")
            print('{:^20}'.format(str(batch_avg_acc * 100)), end="")
            print('{:^20}'.format(str(_vacc)))
            # Reset the accumulators
            batch_avg_loss = 0.
            batch_avg_acc = 0.
            start_time = time.clock()
    print("Optimization Finished!")
    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy - run the test set
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Test Accuracy:", accuracy.eval({x: test_features, y: test_labels, keep_prob: 1.}) * 100.)
    # Save the variables to disk
    save_path = saver.save(sess, "model.ckpt")
    print("Model saved in file: %s" % save_path)
# Plot the validation accuracy recorded at each logging step
plt.plot(cost_batch)
plt.ylabel('validation accuracy (%)')
plt.xlabel('logging step')
plt.show()
How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)
Answer:
Optimizer: Adam (for the convenience of having fewer hyperparameters to tune)
Batch Size: the maximum that fits on my GPU (4096)
Epochs: 501 (limited by training time)
Hyperparameters:
What approach did you take in coming up with a solution to this problem?
Answer:
My steps
Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.
You may find signnames.csv useful as it contains mappings from the class id (integer) to the actual sign name.
Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.
### Load the images and plot them here.
### Feel free to use as many code cells as needed.
def img_import(file_path_):
    image = cv2.imread(file_path_)
    # Resize down to the network's 32x32 input size
    image = cv2.resize(image, (32, 32))
    return image
print('Success: new image helper functions defined.')
# Load and show images
# images from https://en.wikipedia.org/wiki/Road_signs_in_the_United_States
img_folder = 'test-data'
img_names = os.listdir(img_folder)
extra_imgs = []
extra_labels = []
for name in img_names:
    label = int(name[:name.find('.')])
    extra_imgs.append(img_import(img_folder + "/" + name))
    extra_labels.append(label)
    fig = plt.figure()
    a = fig.add_subplot(1, 2, 1)
    a.set_title(SignNames[label])
    plt.imshow(extra_imgs[-1])
    # Display an image from the training set with the same label
    training_index = next(index for index, val in enumerate(y_train) if val == label)
    a = fig.add_subplot(1, 2, 2)
    a.set_title(SignNames[label])
    plt.imshow(X_train[training_index])
extra_imgs = np.array(extra_imgs)
extra_labels = np.array(extra_labels)
Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It would be helpful to plot the images in the notebook.
Answer:
I've taken 5 images from https://en.wikipedia.org/wiki/Road_signs_in_Germany. Given that I'm reusing German traffic signs, they should be a pretty good match. Some difficulties might be:
Beyond this, as pointed out by my first project reviewer, some further difficulties might be:
So all in all, these and other real-world images will be a difficult set for the model to predict.
### Run the predictions here.
### Feel free to use as many code cells as needed.
# Normalize and one-hot encode the new images
extra_nrm = np.resize(equalize(grayscale(extra_imgs)), (5, 32, 32))
extra_ohe = np.zeros((5, 43))
for i in np.arange(extra_labels.shape[0]):
    extra_ohe[i][extra_labels[i]] = 1.0
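The one-hot loop above can also be written as a single NumPy indexing expression into an identity matrix; a small standalone sketch with illustrative class ids:

```python
import numpy as np

# One-hot encoding by indexing rows of a 43x43 identity matrix
labels = np.array([3, 0, 42, 7, 1])         # illustrative class ids
ohe = np.eye(43, dtype=np.float32)[labels]
print(ohe.shape)             # (5, 43)
print(int(ohe[2].argmax()))  # 42
```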
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
pred_val = tf.argmax(pred, 1)
pred_softmax = tf.nn.softmax(pred)
pred_topFive = tf.nn.top_k(pred_softmax, k = 5)
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "model.ckpt")
    print("Model restored.")
    # Get the predictions
    extra_soln_vals, extra_soln_topFive = sess.run([pred_val, pred_topFive], feed_dict={x: extra_nrm, keep_prob: 1.})
    # Print out the results
    print('Actuals: ', extra_labels)
    print('Soln_vals: ', extra_soln_vals)
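The tf.nn.top_k op used above has a simple NumPy analogue, shown here on an illustrative probability vector:

```python
import numpy as np

# NumPy equivalent of tf.nn.top_k(softmax_probs, k=5) for one sample
probs = np.array([0.05, 0.40, 0.10, 0.30, 0.02, 0.08, 0.07])
k = 5
top_idx = np.argsort(probs)[::-1][:k]   # indices of the k largest values
top_val = probs[top_idx]                # the values themselves, descending
print(top_idx)  # [1 3 2 5 6]
```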
# Get 5 images of the same classes from the test set and see how well our model predicts
test_imgs = []
test_labels = []
for i in np.arange(extra_nrm.shape[0]):
    index = next(index for index, val in enumerate(y_test) if val == extra_labels[i])
    test_imgs.append(X_test_nrm[index])
    test_labels.append(y_test[index])
test_imgs = np.array(test_imgs)
test_labels = np.array(test_labels)
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
pred_val = tf.argmax(pred, 1)
pred_softmax = tf.nn.softmax(pred)
pred_topFive = tf.nn.top_k(pred_softmax, k = 5)
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "model.ckpt")
    print("Model restored.")
    # Get the predictions
    test_soln_vals, test_soln_topFive = sess.run([pred_val, pred_topFive], feed_dict={x: test_imgs, keep_prob: 1.})
    # Print out the results
    print('Actuals: ', test_labels)
    print('Soln_vals: ', test_soln_vals)
# Display the results
for i in np.arange(extra_nrm.shape[0]):
    fig = plt.figure()
    # Show the extra image
    a = fig.add_subplot(1, 4, 1)
    if i == 0: a.set_title("Extra Image")
    plt.imshow(extra_nrm[i], cmap='gray')
    # Display a training image matching the extra prediction
    a = fig.add_subplot(1, 4, 2)
    training_index = next(index for index, val in enumerate(y_train) if val == extra_soln_vals[i])
    if i == 0: a.set_title("Extra Prediction")
    plt.imshow(X_train[training_index])
    # Show the test image
    a = fig.add_subplot(1, 4, 3)
    if i == 0: a.set_title("Test Image")
    plt.imshow(test_imgs[i], cmap='gray')
    # Display a training image matching the test prediction
    a = fig.add_subplot(1, 4, 4)
    training_index = next(index for index, val in enumerate(y_train) if val == test_soln_vals[i])
    if i == 0: a.set_title("Test Prediction")
    plt.imshow(X_train[training_index])
Is your model able to perform equally well on captured pictures when compared to testing on the dataset?
Answer:
It does a decent job: only 1 classification missed. But in the missed case it gets it absolutely wrong (it does not even predict some other speed sign). For the corresponding image from the test set, it does really well. As discussed before we started predicting, the scales of the images are vastly different. Since for the speed signs the classifier needs much more information than primitive shapes, the scale difference matters a lot more. I'm surprised that it gets the "priority road" sign, as I expected that to be a tough test case.
### Visualize the softmax probabilities here.
### Feel free to use as many code cells as needed.
fig = plt.figure(figsize=(15, 20))
for i in np.arange(extra_nrm.shape[0]):
    a = fig.add_subplot(5, 2, (i * 2) + 1)
    if i == 0: a.set_title("Extra Image")
    plt.bar(extra_soln_topFive.indices[i][:], extra_soln_topFive.values[i], color="blue")
    plt.axis([0, 43, 0, 1])
    a = fig.add_subplot(5, 2, (i * 2) + 2)
    if i == 0: a.set_title("Test Image")
    plt.bar(test_soln_topFive.indices[i][:], test_soln_topFive.values[i], color="blue")
    plt.axis([0, 43, 0, 1])
Use the model's softmax probabilities to visualize the certainty of its predictions, tf.nn.top_k could prove helpful here. Which predictions is the model certain of? Uncertain? If the model was incorrect in its initial prediction, does the correct prediction appear in the top k? (k should be 5 at most)
Answer:
For image 1: Ah! The model barely manages to predict it. This was a tough test case, one I expected the model to fail on, and it almost does. The reason might be that it's a yellow sign (the only one in the dataset); since we aren't using color, we could be putting the model at a disadvantage here.
For image 3: the actual correct answer does not appear in the top 5 at all. You can see that the model is not at all confident in its prediction and hence ends up making the mistake. My guess is that speed indicators, especially when they vary in size, are difficult for the model to predict, so it ends up matching the shape of the sign (a circle): all of the top-5 predictions are circular signs. And it does predict 50 km/h as the third-highest prediction.
Note: in both cases, the model does extremely well on the test-set image (no confusion at all). This strongly suggests that the scale of the extra image is what messes up the prediction.
If necessary, provide documentation for how an interface was built for your model to load and classify newly-acquired images.
Answer:
Note: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the iPython Notebook as an HTML document. You can do this by using the menu above and navigating to File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.